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Abstract 

We introduced least absolute shrinkage and selection operator (lasso) in obtaining periodic signals in 
unevenly spaced time-series data. A very simple formulation with a combination of a large set of sine 
and cosine functions has been shown to yield a very robust estimate, and the peaks in the resultant 
power spectra were very sharp. We studied the response of lasso to low signal-to-noise data, asymmetric 
signals and very closely separated multiple signals. When the length of the observation is sufficiently long, 
all of them were not serious obstacles to lasso. We analyzed the 100-year visual observations of S Cep, 
and obtained a very accurate period of 5.366326(16) d. The error in period estimation was several times 
smaller than in Phase Dispersion Minimization. We also modeled the historical data of R Set, and obtained 
a reasonable fit to the data. The model, however, lost its predictive ability after the end of the interval 
used for modeling, which is probably a result of chaotic nature of the pulsations of this star. We also 
provide a sample R code for making this analysis. 
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1. Introduction 

There has been a wealth of history in analyzing period- 
icities in time-scries data, particularly for variable stars. 
Determination of periods in variable stars is a very fun- 
damental issue in that they are used in many applica- 
tions, including classification of variable stars, calibration 
of the period-luminosity relations, determination of the 
pulsation modes, detection of stellar rotation, detection 
of exoplanet transits and determination of the preces- 
sion mode and frequency in accretion disks. These time- 
series data in practical astronomy are not usually sam- 
pled evenly in the time domain, and it is difficult to apply 
fast Fourier transform (FFT) directly as in other fields 
of science. Alternatively, there have been well-established 
methods based on discrete Fourier transform (Deeming 
1975) and on least-squares fit of sinusoids for unevenly 
spaced data (Lomb-Scargle Periodogram, Scargle 1982; 
Home, Baliunas 1986; Zechmcister, Kurster 2009). This 
type of method has been extended to extract finite num- 
bers of signals by subsequently subtracting the strongest 
signals (CLEAN; Roberts et al. 1987). 

Another class of approach has been taken by evaluating 
dispersions either in sum of lengths between phase-sorted 
data (Dworetsky 1983) or sum of dispersions in phased 
bins against trial periods. This class of approach in- 
cludes Laflcr, Kinman (1965), well-used Phase Dispersion 
Minimization (PDM, Stcllingwcrf 1978) and Analysis of 
Variance (AoV, Schwarzcnbcrg-Czerny 1989). 

Yet another class of approach to estimate the power 



spectrum is by assuming a sum of <5-function-like poles 
on a frequency space extended to the full complex space 
(Childers 1978, chap. 2; Kay, Marple 1981). This 
approach is generally referred to as Maximum Entropy 
Method (MEM) or autoregressivc model (AR). Although 
this approach is potentially very powerful in detecting a 
small number of sharp signals in the power spectrum, the 
usage in practical astronomical data has been limited be- 
cause the well-known quick algorithm (Burg algorithm) 
using a recursive technique can only be applied to evenly 
spaced data (Burg 1967). 

In recent years, there has been a remarkable progress in 
the field of Compressed Sensing, which deals with a class 
of problems in restoring or estimating sparsely scattered, 
finite number of parameters in a huge dimension. We 
present an application of this method to period analysis. 

2. Methods 

2.1. Decomposition to Fourier Components 

The following description basically follows the mathe- 
matical formulation reviewed in Tanaka (2010). [For a re- 
view of Compressed Sensing, see Donoho (2006)]. We as- 
sume a time-series data Y(ti) with unevenly spaced times 
of observations at ij. The data are assumed to be sub- 
tracted for their average, and the mean of Y is zero. The 
observation can be expressed as a sum of signal (Y s ) and 
random errors (n) : 



Y l =Y(t t ) = Y s (t l ) + n(t l ). 



(1) 
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We assume that the signal is composed of a sum of strictly 
periodic functions. Using sine and cosine Fourier compo- 
nents, Y s can be expressed as : 



y a (k) = Qj cos(ujjU) + / ] bj sin(ujjti) 



(2) 



where u are frequencies and a and b are amplitudes. The 
problem can be set to estimate a and b from Y. If the num- 
ber of different ui is larger than the half number of observa- 
tions, the equation cannot be solved uniquely. Rewriting 



,Y N f 



(3) 



and 



(ai, • • • ,ajvf,6i, • • • ,6m) T , 



(4) 

where N and Af are number of observations and number of 
different lo, respectively. We can rewrite a set of equations 
1 and 2 as a form of 

y = Ax + n (5) 

using a 2M x ./V observation matrix A composed of 



-4, 



cos(wjtj), (i < M) 
sm(uii- M tj), (i > M). 



(6) 



We define x as the vector to be estimated. 

The problem is to estimate sparse Xq, i.e., with only 
small number of non-zero elements, under the restriction 
of y = Ax, if the time-series data is expected to have only 
a few periodic components, which is often the case with 
period analysis. We define "0-norm" ||jc||o as the number 
of non-zero elements in xq. The problem can be then 
formalized to choose x : 



x^ ' = argmin||x||o subj. to y = Ax. 



(7) 



This estimate is called £q regularization, and is known 
to be very robust in this problem setting. The problem, 
however, is that £ Q regularization is computationally diffi- 
cult, and is known to be an NP-hard problem (Natarajan 
1995). 

In order to overcome this difficulty, the following alter- 
native p-norms are usually employed: 

(£LW P ) (1/P) , (p>o) 

||ai||o, (p = 0). 



(8) 



In particular, p—1 (£i) regularization has an advantage in 
that it can be solved by conventional ways: by introduc- 
ing a vector u = {u\, ■ ■ ■ ,mat) t composed of non-negative 
elements, the equation becomes 



N 

min U{ 



subj. to — u < x < u, y = Ax, 



(9) 



and it is a well-known linear programming (LP). Most no- 
tably, it has been recently demonstrated that this l\ reg- 
ularization is also the sparsest solution (£q regularization) 
in most large underdetermined systems (see e.g., Donoho 
2006; Candes, Tao 2006). 



There have been a number of methods using this l\ con- 
strained estimations. Among them, Tibshirani (1996) pre- 
sented a concept of least absolute shrinkage and selection 
operator (lasso). This estimation has become well-known 
after the development of a fast algorithm, known as least 
angle regression (LAR) by Efron et al. (2004). The lasso 
is defined as: 1 

argmin(F(a;) = — ||y-Aa;||^ + A||x||i), (10) 



-LAR 
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where A||a;||i is the £i-norm penalty function with a pa- 
rameter A having a value of A > 0. This estimate becomes 
identical with a least-squares estimation at A = 0. 

The optimal value of A may be chosen by using the 
cross-validation technique: after breaking the data into 
two groups, use the one group for lasso analysis and use 
the rest of data for evaluation of residuals. Mean squared 
errors (MSE) were evaluated using several randomly cho- 
sen different partitioning of data. One can then estimate 
A giving the smallest MSE. In application to analysis of 
actual time-series data, it might be helpful to survey the 
response of lasso using different A values, rather than fix- 
ing at A giving the smallest MSE, since the derived periods 
are not very different between different A values, and since 
smaller A gives smaller number of false signals, which is 
profitable when it is already known that only small num- 
ber of signals are present in the data. An example of 
cross-validation diagram is shown in subsection 3.3 (fig- 
ure 8). 

We must note that the present application of l\ regular- 
ization using sinusoidal functions is not unprecedented in 
astronomy, but that there was at least one in the prc-LAR 
era in analyzing radial-velocity data (Chen, Donoho 1998) 
based on the same scheme. We introduce the present 
method because time-series data in variable stars are com- 
monly met in practical astronomy and are very suitable 
targets for the present analysis, cither in detection of 
multiple periods or precise determination of the periods. 
Furthermore, the development of LAR algorithm and its 
public availability on a popular platform has enabled an 
application of the Compressed Sensing far easier to reach 
than in the time of Chen, Donoho (1998). 

2.2. Numerical Solution 

This LAR algorithm has been implemented as the pack- 
age lars in R software 2 and we used this package combined 
with glmnet (generalized linear model via penalized max- 
imum likelihood) as a wrapper, which also provides cross- 
validation results. Sec appendix for a sample code. 



In actual calculation of lasso estimates, we need to normalize 
parameters, i.e., y\ Ajj = 0, £V A* = 1, for j ' = 1, ■ ■ ■ ,2M 
(Friedman et al. 2010). We here introduce a conceptual formula 
and avoided complicated use of normalization factors. The glm- 
net package in R automatically performs this normalization and 
we usually do not need to consider the normalization. 
2 The R Foundation for Statistical Computing: 
<http://cran.r-projcct.org/>. 
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Fig. 1. VSOLJ observations of 6 Cep. 

3. Results 

3.1. Response to Artificial Data: Single Frequency 

We examined the response of this analysis to the artifi- 
cial data. In order to reflect typical astronomical time- 
series observations, we used observations of 5 Cep by 
Variable Star Observers' League in Japan (VSOLJ), span- 
ning 1906 through 2001, with a total number of 2955 vi- 
sual estimates having large gaps (1911-1944, 1946-1955) 
and inhomogcneous density of observations (figure 1). We 
presented response to different A values 3 as well as the 
model at the smallest MSE. We used the times of these 
observations and replaced the magnitudes with artificial 
data. The first one (figure 2) is a pure sine wave with a 
period of 10 d with Gaussian noises whose a is the same 
as the amplitude of the wave. In this case, the true pe- 
riod was always reproduced within errors of 10 -4 d. Only 
small noises appear when A is very small. The second case 
is with Gaussian noises five times larger than the ampli- 
tude of the wave (figure 3). In this case, the signal was 
completely lost when A is small (over-expression of the 
noises by Fourier components). This result indicates that 
we need to choose adequate A depending on the signal- 
to-noise ratio of the data. The model with the smallest 
MSE is found to be an adequate solution in this case. The 
third case is the asymmetric waveform. We added second 
overtone (a period of 5 d) having a half amplitude of the 
primary wave. The noises were as in the first case (figure 
4) . The estimates of the period were observed to slightly 
deviate from the correct value (typically 2 10 -4 d), and 
the deviation depends on A. We allowed two spectral win- 
dows 9.9-10.1 d and 4.95-5.05 d and obtained figure 5. 
After allowing the second window, the main period was 
more clearly detected. If cycle numbers are sufficient as in 
this case, asymmetric profiles did not significantly affect 
the results. 

3.2. Separation of Peaks 

We then examined the ability of this lasso estimate for 
identifying very closely separated signals. We analyzed 
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3 In referring to log A, we used natural logarithms throughout the 
paper. 
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Fig. 2. Analysis of a pure sine wave with noises. 

a combination of two periods of 10 d and 10.005 d and 
they were always detected as separate signals regardless 
of the different sequence of random numbers used to add 
noises (figure 6). Assuming an effective baseline of 15000 
d, these two periods correspond to a difference of 0.75 
cycle between the limits of the baseline. In a case of 10 
d and 10.001 d, we could resolve the signals in about half 
trials. It was impossible to resolve a combination of 10 
d and 10.0005 d. This degree of resolution is, naturally, 
far beyond the reach of conventional techniques, such as 
PDM and Fourier-type analysis. 

3.3. Analysis of 5 Cephei 

We performed an analysis of S Cep itself. The examined 
periods were 1000 equally spaced bins between 5.36 and 
5.37 d. Since the peaks were very sharp, we enlarged the 
figure to show the profile (figure 7). The cross-validation 
diagram is shown in figure 8, which indicates that a com- 
bination of 5-40 frequencies best describes the data. The 
resultant (strongest) period was 5.366326 d (in actual cal- 
culation, we used ten times smaller bins to obtain this pre- 
cision) . The period in General Catalog of Variable Stars 
(Kholopov et al. 1985) is 5.366341 d. The period obtained 
by PDM was 5.36629(4) d (the error was estimated by 
the methods of Fernie 1989 and Kato et al. 2010). Since 
lasso estimate is highly nonlinear as in MEM, it is dif- 
ficult to estimate the error of the obtained period. We 
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Fig. 3. Analysis of a pure sine wave with noises five times 
larger than the signal. 

alternatively performed delete-half jackknifing with ran- 
dom subsampling (Shao, Tu 1995): randomly selected 50 
% of observations and obtained periods from 10 different 
sets. The resultant 1-er standard deviation was 0.000016 
d. The estimate of the error appears to be consistent with 
the difference between our result and the GCVS period, 
and lasso estimate is several times more accurate than 
with the PDM in the present case. An addition of ran- 
dom typical errors (0.1 mag) of visual observations did 
not affect the result, indicating that lasso estimate is very 
robust in detecting strictly periodic signal under the pres- 
ence of large errors. 

3.4- Application to R Scuti 

R Set is a well-known RV Tau-type variable star, char- 
acterized by the presence of alternating deep and shallow 
minima. Many RV Tau-type variables exhibit irregulari- 
ties and the regular alternations of minima are often dis- 
turbed. R Set is notable in its irregular behavior among 
RV Tau stars (e.g. Kollath 1990; Buchler et al. 1996). 
Buchler et al. (1996) reported that Fourier decomposition 
reasonably well expresses the observed features of the light 
curve, but that its predictive power is limited. Since their 
Fourier decomposition was not based on l\ regularization, 
it would be interesting to see whether the periods selected 
by l\ regularization equally well describe the light curve 



Fig. 4. Analysis of non-sinusoidal wave. 

and whether the model based on these periods has a better 
predictive power. 

The data used were VSOLJ observations between 1906 
and 2001 (16767 points). We used two windows of peri- 
ods 60-80 d and 120-160 d. The strongest signals were at 
143.8 d and 70.9 d, associated with weaker signals (figure 
9). We used log A = —6.27, which is the most regularized 
model with a cross-validation error within one standard 
deviation of the minimum. Although the minimum MSE 
was reached by logA = —10.18, we did not adopt it be- 
cause it was dominated by false signals and because it 
is physically less likely that such a large number of pul- 
sation modes coexist in a radially pulsating star. It is 
noteworthy that the longer period is not the twice of the 
shorter period, in contrast to Fourier analysis (Kollath 
1990) or our own analysis with the PDM (142.1 d and 71.0 
d). These periods may represent fundamental and first- 
overtone periods close to the 2:1, but slightly different, 
resonance which is expected to explain the RV Tau-type 
phenomenon (Takcuti, Petersen 1983). 

The lasso model seems to adequately express observa- 
tions (figure 10), indicating that varying amplitudes are 
basically a result of a combination of waves with periods 
close to the 2:1, but slightly different, resonance. It would 
be noteworthy that interval with very low amplitudes (e.g. 
JD 2451000-2451600) is well expressed by this model. 

Figure 11 presents an example of lasso predictions. The 
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Fig. 5. Analysis of a non-sinusoidal wave, fitted with a sec- 
ond spectral window. 



Fig. 6. Analysis of closely separated double waves. 



model soon lost the predictive ability after the end of the 
epoch (JD 2452240) used for modeling. We also obtained 
a model to the same data restricted to JD before 2451000. 
Although the model well describes the observations used 
in the fit, it again loses the ability after the end of the 
epoch used for modeling (figure 12). These results are 
similar to that in Buchlcr ct al. (1996), and it is likely 
that a simple combination of multiple sinusoidal waves is 
not an adequate model for expressing the behavior of this 
object. This is probably caused by the chaotic nature of 
the pulsations in this star. 

3. 5. Splitting of Signals for Badly Phased Data 

Although equation (6)-type bases are mathematically 
sufficient for ordinary Fourier analysis, and this formula- 
tion for lasso analysis is adequate for many cases (with 
sufficient number of data and length of time-series), the 
lasso response to even a simple cyclic function can yield 
split signals due to the lasso's characteristics of its norm. 
This is problematic because it results in erroneous estima- 
tion of periods. Whereas we did not observe the problem 
in the examples presented so far, the problem can arise 
in certain applications of lasso-based methods to period 
analysis, and is therefore potentially harmful. We there- 
fore discuss in this subsection what precisely the problem 
is, and then propose a means to circumvent it. 



This situation can be understood by a simple example. 
If observed data has a form of y(t) = sm(ut), the lasso 
result has an expected form of y LAR (i) = csin(cji), unless 
A is too small. Although the amplitude c is different from 
1, we can get the expected frequency uj. If the observed 
data, however, has a form 

y(t) = sm((jt) + cos(ojt), (11) 

there are different solutions with equally acceptable 
squared residuals in a form of 

y(t) = cism(uxt) + c 2 cos(u)2t), (12) 

and 

|d| + N<2, Wj^2, (13) 

which is preferred by lasso, and the result is observed as 
two signals at uj\ and W2 around the expected signal uj. 
For a numerical example, a lasso application to the gen- 
erated data for t = 1,---,100 and u = 1/10 in equation 
(11) yields uji = 1/9.818 and uj 2 = 1/10.193 for any A in 
-2.58 < log A < -0.21. 4 The coefficients a = 0.7632 and 
c 2 =0.7545 yield ^\\y-Ax\\l =0.0039 and ||a;||i = 1.5177 

4 This split occurs regardless of the number of the data and length 
in time. Although the split is smaller with longer length in 
time, the deviation from the expected signal is larger than the 
accuracy of period determination with the PDM. 
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Fig. 8. Cross-validation diagram for 5 Cep. The points and 
error bars represent average mean-squared errors (MSEs) of 
cross-validation and their standard deviations, respectively. 
The left vertical dotted line represents the location of the least 
MSE. The right vertical dotted line represents the most reg- 
ularized model with a cross-validation error within one stan- 
dard deviation of the minimum. The numbers on the upper 
border of the figure represents numbers of non-zero coeffi- 
cients for each value of A. 



Fig. 7. Analysis of VSOLJ data of 5 Cep. 



in equation (10). 5 For u>i = oj- 2 = 1/10 and C\ = c 2 = 1, 
the corresponding values are and 2, respectively. For A 
larger than 0.008, the former set of parameters (with split 
signals) gives a smaller i ||y — Ax\\\ + \\x\\i. Although 
a very small A might improve the situation, such a small 
A will spoil the advantage of l\ regularization. 

Although this difficulty can be avoided by introduc- 
ing complex coefficients and exp(— MJi)-type basis (Candes 
et al. 2006; Rauhut 2007), this type of l\ regularization 
cannot be achieved by lars package. As an alternative, 
practical approach to reduce this false splitting of signals, 
we have introduced an N^M x N matrix in the following 
form: 

k 

AkM+i.j = cos(wjij + — tt), (14) 

where Nk is the number of phase bins and k = 0, • • • , — 1 . 
Although the equation (14)-type basis is not mathemat- 
ically independent (it is an "over-complete" basis or a 
"frame"), the non- linear response of lasso, caused by the 
characteristics of £i-norm, enables us to detect an ade- 
quate frequency and phase at the same time. The num- 

5 These ci and C2 were chosen to minimize I \y — Ax\ || in order 
to best illustrate the problem. The actual values of c\ and C2 
selected by lasso are dependent on A. For example, A = 0.095 
yields ci = 0.5488 and c 2 = 0.5599, giving ^ | \y-Ax\ || = 0.0399 
and ||£e||i = 1.1087. In this case, F(x) = 0.145 and is indeed 
smaller than F(x) = 0.190 for u>i = oJ2 = 1/10 and c\ = c 2 = 1. 
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Fig. 10. A segment of lasso modeling to observations of R 
Set. The points represent observations and the curves repre- 
sent a lasso model. 



Fig. 12. A segment of lasso modeling to observations of R 
Set. The data are the same as in figure 10. The points repre- 
sent observations and the curves represent a lasso model fitted 
for the data before JD 2451000. 
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Fig. 11. Lasso predictions of R Set. The points represent 
VSNET (Kato et al. 2004) observations and the curves repre- 
sent a lasso model. The model soon lost the predictive ability 
after the end of the interval (JD 2452240) used for modeling. 



ber Nk can be determined experimentally, and the for- 
mulation becomes equivalent to equation (6) in the case 
of Nk = 2. In many of randomly sampled and randomly 
phased actual data, there is no special need to use Nk > 2. 
Even in most extreme cases, we have found that Nk = 40 
successfully reproduced the original spectrum. 

We used an artificial data x = (1, • • • , 100) T and used a 
function of y(x) = sin(27ra;/10 + 0.63) to produce y. The 
phase offset of 0.63 was chosen to produce the most unfa- 
vorable condition. The frequency window corresponds to 
periods of 9.9-10.1. The effect of different Nk on this arti- 
ficial data is shown in figure 13. Since our present aim is to 
determine the single period precisely, we did not employ 
cross-validation, which increases the number of detected 
signals, but used a fixed log A = —2.6. In the case of Nk = 2, 
the reproduced signals are located on the upper and lower 
limits of the tested range. In the case of Nk = 4, the sepa- 
ration between split peaks became smaller, and we could 
obtain a merged signal with Nk = 40. With an even larger 
Nk, false peaks were more prone to appear. 

Using y(x) = sin(27ra;/10 + 7r/4), which corresponds to 
the example in equation 11, we could obtain a reason- 
able result (single signal) with Nk = 4. The experiment 



8 



T. Kato and M. Uemura 



0.25 

a 0.15 
ft 

.£ °- 10 

« 0.05 
K 

0.00 



A- 



1 



10.00 
Period 



Fig. 13. Effect of on badly phased data. 

suggests that false splitting of signals is expected to be 
avoided with Nk up to 40, even in the worst cases. 

4. Conclusion 

We introduced lasso in obtaining periodic signals in un- 
evenly spaced time-series data. A very simple formulation 
with a combination of a large set of sine and cosine func- 
tions has been shown to work very well, and the peaks in 
the resultant power spectra were very sharp. We studied 
the response of lasso to low signal-to-noise data, asym- 
metric signals and very closely separated multiple signals. 
When the length of the observation is sufficiently long, all 
of them were not serious obstacles to lasso. We analyzed 
the 100-year visual observations of S Cep, and obtained a 
very accurate period of 5.366326(16) d. The error in pe- 
riod estimation was several times smaller than in PDM. 
We also modeled the historical data of R Set, and obtained 
a reasonable fit to the data. The main two periods (143.8 
d and 70.9 d) were not exactly in a 2:1 ratio. The model, 
however, lost its predictive ability soon after the end of 
the data used for modeling, which is probably a result of 
chaotic nature of the pulsations of this star. We also for- 
mulated a scheme by using different set of phases in the 
basis, and confirmed that this scheme is useful when the 
phases of observed data are most unfavorable for lasso. 
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Appendix. Sample R code 

We provide a sample R code to make lasso analy- 
sis of time-series data. We assume that the data t = 
(ii, • • • , ijv) T , V are stored as two elements VI and V2 
in the data frame d. The variable p is a vector of as- 
sumed periods (for convenience, we provide a function 
seqf req to make a series of periods evenly spaced in the 
frequency space). The lasso estimate can be obtained 
by pow <- perlasso(d,p) . The parameter ndiv corre- 
sponds to Nk in the text, and the parameter cv is a flag 
whether to compute cross-validations. The results can be 
plotted by plot (pow, n), where n represents the A bin 
(the values of A are stored in pow$lambda). The result 
of cross-validation is stored in the list element of gcv and 
the element nmin is the A bin giving the smallest MSE. 
The function minper returns the strongest signals and the 
function lassof it presents a fit to the data. 

library (lars) 
library (glmnet ) 

seqfreq <- function(a,b, . . . ) { 
return (1/ seq(l/b, 1/a, . . . ) ) 

} 

makematlasso <- function(d,p,ndiv) { 
nd <- length (p) 

m <- matrix (0 ,nrow(d) ,nd*ndiv) 
for (i in l:nd) { 

ph <- ((d$Vl/p[i]) •/./. l)*pi*2 

for (j in 0: (ndiv-1)) { 

m[,i+nd*j] <- sin(ph+pi*j/ndiv) 

} 

} 

return (m) 

} 

perlasso <- f unction(d,p,ndiv=2, alpha=l , 
cv=FALSE) { 
nd <- length (p) 
mat <- makematlasso (d,p,ndiv) 
y <- d$V2 - mean(d$V2) 
m <- glmnet (mat , y , alpha=alpha) 
ndim <- m$dim[2] 
pow <- matrix(0,nd,ndim) 
for (i in l:ndim) { 
v <- m$beta[,i] 
for (j in 0: (ndiv-1)) { 
pow[,i] <- pow[,i] + 

v[(nd*j + l) : (nd*(j+l))]~2 
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lassofit <- function(pow,x,n,ndiv=2) { 

d <- data. frame (Vl=x,V2=numeric (length(x) ) ) 
m <- pow$mat 

return(m '1*7, pow$m$beta[,n] ) 

} 

lassoexpect <- function(pow,x,n,ndiv=2) { 

d <- data. frame (Vl=x,V2=numeric (length (x) ) ) 
m <- makematlasso(d,pow$p,ndiv) 
return (m '/,*'/, pow$m$beta [ ,n] ) 
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} 

} 

nmin <- NULL 
gcv <- NULL 
if (cv) { 

gcv <- cv . glmnet (mat , d$V2 , alpha=alpha) 
mini <- gcv$lambda.min 
nmin <- which. min(abs( 

m$lambda-gcv$lambda.min) ) 

} 

r <- list(pow=pow,p=p,lambda=m$lambda, 

nmin=nmin , m=m , gc v=gc v , mat=mat ) 
class(r) <- c ("lassopow" , class(r) ) 
return (r) 

} 

plot . lassopow <- function(pow,n, . . . ) { 
p <- pow$p 
pw <- pow$pow 

plot(p,pw[,n] ,typ="l" ,xlab="Period" , 
ylab="Relative Power" , . . . ) 

} 

minper <- function(pow,n,num=l) { 
p <- pow$p 
pw <- pow$pow 
pp <- numeric () 
while (num > 0) { 

i <- which. max (pw[,n] ) 

pp <- c(pp,p[i]) 

num <- num-1 

pw[i,n] <- 

} 

return (pp) 

} 



